Handwritten Isolated Bangla Compound Character Recognition: a new benchmark using a novel deep learning approach
In this work, a novel deep learning technique for the recognition of
handwritten isolated Bangla compound characters is presented, and a new
benchmark of recognition accuracy on the CMATERdb 3.1.3.3 dataset is reported.
Greedy layer-wise training of deep neural networks has enabled significant
strides in various pattern recognition problems. We apply layer-wise training
to Deep Convolutional Neural Networks (DCNNs) in a supervised fashion and
augment the training process with the RMSProp algorithm to achieve faster
convergence. We compare results with those obtained from standard shallow
learning methods with predefined features, as well as standard DCNNs.
Supervised layer-wise trained DCNNs are found to outperform both standard
shallow learning models such as Support Vector Machines and regular DCNNs of
similar architecture, achieving an error rate of 9.67% and thereby setting a
new benchmark on CMATERdb 3.1.3.3 with a recognition accuracy of 90.33%, an
improvement of nearly 10%.
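The training process above is accelerated with RMSProp; as a minimal sketch of
the RMSProp update rule (the function name, learning rate, and toy objective
are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def rmsprop_step(w, grad, cache, lr=0.001, decay=0.9, eps=1e-8):
    # Keep a running average of squared gradients and divide the step by
    # its square root, so each weight gets an adaptive effective step size.
    cache = decay * cache + (1 - decay) * grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache

# Toy usage: a few updates on f(w) = w^2, whose gradient is 2w.
w, cache = 5.0, 0.0
for _ in range(100):
    w, cache = rmsprop_step(w, 2 * w, cache, lr=0.1)
```

The per-parameter scaling is what gives RMSProp its faster convergence than
plain gradient descent on poorly scaled problems.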
A two-pass fuzzy-geno approach to pattern classification
This work presents an extension of the fuzzy approach to 2-D shape recognition
[1] through refinement of initial, or coarse, classification decisions in a
two-pass approach. An unknown pattern is classified by refining the possible
classification decisions obtained through its coarse classification. To build
a fuzzy model of a pattern class, horizontal and vertical fuzzy partitions on
the sample images of the class are optimized using a genetic algorithm. To
make coarse classification decisions about an unknown pattern, the fuzzy
representation of the pattern is compared with the models of all pattern
classes through a specially designed similarity measure. Coarse classification
decisions are refined in the second pass to obtain the final classification
decision for the unknown pattern. To do so, optimized horizontal and vertical
fuzzy partitions are again created on certain regions of the image frame,
specific to each group of similar pattern classes. Experiments show that the
technique improves the overall recognition rate from 86.2% in the first pass
to 90.4% after the second pass, with 500 training samples of handwritten
digits.
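As a rough illustration of the coarse first pass, the sketch below computes
cell-density features over fixed horizontal and vertical partitions and
compares them with a class model through a simple similarity measure. In the
paper the partition boundaries are optimized by a genetic algorithm and the
similarity measure is specially designed, so both functions here are
simplifying assumptions:

```python
import numpy as np

def partition_features(img, h_cuts, v_cuts):
    # Density features: mean foreground value in each cell of the grid
    # formed by horizontal and vertical partition boundaries.
    feats = []
    for band in np.split(img, h_cuts, axis=0):
        for cell in np.split(band, v_cuts, axis=1):
            feats.append(cell.mean())
    return np.array(feats)

def similarity(f1, f2):
    # Placeholder similarity: 1 minus the mean absolute feature difference.
    return 1.0 - np.abs(f1 - f2).mean()
```

In a two-pass scheme, the same machinery would be re-applied in the second
pass on class-group-specific sub-regions of the image frame.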
Segmentation of Offline Handwritten Bengali Script
Character segmentation has long been one of the most critical stages of the
optical character recognition process. Through this operation, an image of a
sequence of characters, which may be connected in some cases, is decomposed
into sub-images of individual alphabetic symbols. In this paper, segmentation
of cursive handwritten script of the world's fourth most popular language,
Bengali, is considered. Unlike English script, Bengali handwritten characters
and their components often encircle the main character, making conventional
segmentation methodologies inapplicable. Experimental results using the
proposed segmentation technique, on sample cursive handwritten data containing
218 ideal segmentation points, show a success rate of 97.7%. Further feature
analysis on these segments may lead to actual recognition of handwritten
cursive Bengali script. Comment: Proceedings of 28th IEEE ACE, pp. 171-174,
December 2002, Science City, Kolkata
Classification of Log-Polar-Visual Eigenfaces using Multilayer Perceptron
In this paper we present a simple novel approach to tackle the challenges of
scaling and rotation of face images in face recognition. The proposed approach
registers the training and testing visual face images by log-polar
transformation, which is capable of handling the complications introduced by
scaling and rotation. Log-polar images are projected into eigenspace and
finally classified using an improved multi-layer perceptron. In the
experiments we have used the ORL face database and the Object Tracking and
Classification Beyond Visible Spectrum (OTCBVS) database for visual face
images. Experimental results show that the proposed approach significantly
improves recognition performance from visual to log-polar-visual face images.
For the ORL face database, the recognition rate for visual face images is
89.5%, increasing to 97.5% for log-polar-visual face images; for the OTCBVS
face database, the recognition rate is 87.84% for visual images and 96.36% for
log-polar-visual face images.
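The registration step relies on the fact that scaling and rotation about the
image centre become translations in log-polar coordinates. A minimal sketch of
such a mapping follows; the grid sizes and nearest-neighbour sampling are
illustrative choices, as the paper does not specify its implementation:

```python
import numpy as np

def log_polar(img, n_r=32, n_theta=32):
    # Resample img on a log-radius x angle grid about its centre.
    # Scaling the input shifts the output along the radius axis;
    # rotating it shifts the output along the angle axis.
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = np.hypot(cy, cx)
    out = np.zeros((n_r, n_theta))
    for i in range(n_r):
        r = np.exp(np.log(max_r + 1) * (i + 1) / n_r) - 1
        for j in range(n_theta):
            t = 2 * np.pi * j / n_theta
            y = int(round(cy + r * np.sin(t)))
            x = int(round(cx + r * np.cos(t)))
            if 0 <= y < h and 0 <= x < w:
                out[i, j] = img[y, x]  # nearest-neighbour sample
    return out
```

The log-polar images would then be flattened and projected into eigenspace
before classification.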
An adaptive block based integrated LDP, GLCM, and morphological features for Face Recognition
This paper proposes a technique for automatic face recognition using
integrated multiple feature sets extracted from the significant blocks of a
gradient image. We discuss the use of novel morphological, local directional
pattern (LDP), and gray-level co-occurrence matrix (GLCM) based feature
extraction techniques to recognize human faces. First, the new morphological
features, i.e., features based on the number of runs of pixels in four
directions (N, NE, E, NW), are extracted from the significant blocks of the
gradient image, together with the GLCM-based statistical features and the LDP
features, which are less sensitive to noise and non-monotonic illumination
changes. These features are then concatenated. We integrate the
above-mentioned methods to take full advantage of all three approaches.
Extraction of the significant blocks from the absolute gradient image, and
hence from the original image, to extract pertinent information with the idea
of dimension reduction forms the basis of the work. The efficiency of our
method is demonstrated by experiments on 1100 images from the FRAV2D face
database and 2200 images from the FERET database, where the images vary in
pose, expression, illumination, and scale, and on 400 images from the ORL face
database, where the images vary slightly in pose. Our method has shown 90.3%,
93%, and 98.75% recognition accuracy for the FRAV2D, FERET, and ORL databases,
respectively. Comment: 7 pages, Science Academy Publisher, United Kingdom
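Of the three feature sets, the LDP is the most compact to sketch: each pixel's
3x3 neighbourhood is correlated with the eight Kirsch compass masks, and the k
strongest absolute responses set bits of an 8-bit code. The choice k = 3 below
is a common convention, not necessarily the paper's setting:

```python
import numpy as np

# Eight Kirsch compass masks (E, NE, N, NW, W, SW, S, SE), the standard
# directional edge detectors used by LDP.
KIRSCH = [np.array(m, dtype=float) for m in (
    [[-3, -3, 5], [-3, 0, 5], [-3, -3, 5]],    # E
    [[-3, 5, 5], [-3, 0, 5], [-3, -3, -3]],    # NE
    [[5, 5, 5], [-3, 0, -3], [-3, -3, -3]],    # N
    [[5, 5, -3], [5, 0, -3], [-3, -3, -3]],    # NW
    [[5, -3, -3], [5, 0, -3], [5, -3, -3]],    # W
    [[-3, -3, -3], [5, 0, -3], [5, 5, -3]],    # SW
    [[-3, -3, -3], [-3, 0, -3], [5, 5, 5]],    # S
    [[-3, -3, -3], [-3, 0, 5], [-3, 5, 5]],    # SE
)]

def ldp_code(patch, k=3):
    # LDP code of one 3x3 patch: set the bits of the k directions with
    # the largest absolute Kirsch responses.
    resp = np.array([(m * patch).sum() for m in KIRSCH])
    code = 0
    for b in np.argsort(np.abs(resp))[-k:]:
        code |= 1 << int(b)
    return code
```

A face descriptor is then built by histogramming these codes over blocks, here
the significant blocks of the gradient image.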
High Performance Human Face Recognition using Gabor based Pseudo Hidden Markov Model
This paper introduces a novel methodology that combines the multi-resolution
feature of the Gabor wavelet transformation (GWT) with the local interactions
of the facial structures expressed through the Pseudo Hidden Markov Model
(PHMM). Unlike the traditional zigzag scanning method for feature extraction,
a continuous spiral scanning technique has been proposed for better feature
selection: scanning proceeds from the top-left corner to the right, then
top-down and right to left, and so on until the bottom-right of the image.
Unlike traditional HMMs, the proposed PHMM does not make the assumption of
state-conditional independence of the visible observation sequence. This is
achieved via the concept of local structures introduced by the PHMM, used to
extract facial bands and automatically select the most informative features of
a face image. Thus, the long-range dependency problem inherent to traditional
HMMs is drastically reduced. Moreover, the use of the most informative pixels
rather than the whole image makes the proposed method considerably faster for
face recognition. The method has been successfully tested on frontal face
images from the ORL, FRAV2D, and FERET face databases, where the images vary
in pose, illumination, expression, and scale. The FERET data set contains 2200
frontal face images of 200 subjects, the FRAV2D data set consists of 1100
images of 100 subjects, and the full ORL database is considered. The results
reported in this application are far better than those of recent and widely
cited systems. Comment: 9 pages. arXiv admin note: substantial text overlap
with arXiv:1312.151
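One plausible reading of the spiral scan described above is an inward
clockwise traversal of the pixel grid; the sketch below generates that
coordinate order and is an interpretation, not the authors' exact ordering:

```python
def spiral_indices(h, w):
    # Visit (row, col) pairs along the top edge left-to-right, down the
    # right edge, along the bottom edge right-to-left, up the left edge,
    # then spiral inward on the remaining sub-rectangle.
    top, bottom, left, right = 0, h - 1, 0, w - 1
    order = []
    while top <= bottom and left <= right:
        order += [(top, c) for c in range(left, right + 1)]
        order += [(r, right) for r in range(top + 1, bottom + 1)]
        if top < bottom:
            order += [(bottom, c) for c in range(right - 1, left - 1, -1)]
        if left < right:
            order += [(r, left) for r in range(bottom - 1, top, -1)]
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
    return order
```

Feeding pixels to the model in this order turns the 2-D image into the 1-D
observation sequence a (P)HMM expects.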
Face Synthesis (FASY) System for Generation of a Face Image from Human Description
This paper aims at generating a new face from a human-like description using a
new concept. The FASY (FAce SYnthesis) System is a face database retrieval and
new face generation system that is under development. One of its main features
is the generation of the requested face when it is not found in the existing
database, which also allows continuous growth of the database.
Handwritten Bangla Basic and Compound character recognition using MLP and SVM classifier
A novel approach for recognition of handwritten compound Bangla characters,
along with the basic characters of the Bangla alphabet, is presented here.
Compared to Roman scripts such as English, one of the major stumbling blocks
in Optical Character Recognition (OCR) of handwritten Bangla script is the
large number of complex-shaped character classes of the Bangla alphabet. In
addition to 50 basic character classes, there are nearly 160 complex-shaped
compound character classes in the Bangla alphabet. Dealing with such a large
variety of handwritten characters with a suitably designed feature set is a
challenging problem. Uncertainty and imprecision are inherent in handwritten
script. Moreover, such a large variety of complex-shaped characters, some of
which closely resemble one another, makes OCR of handwritten Bangla characters
more difficult. Considering the complexity of the problem, the present
approach attempts to identify compound character classes from the most
frequently to the less frequently occurring ones, i.e., in order of
importance. The aim is to develop a framework for incrementally increasing the
number of learned compound character classes, from the more frequently
occurring ones to the less frequently occurring ones, along with the basic
characters. In experiments, the technique is observed to produce an average
recognition rate of 79.25% under three-fold cross validation of the data, with
scope for future improvement and extension.
A Face Recognition approach based on entropy estimate of the nonlinear DCT features in the Logarithm Domain together with Kernel Entropy Component Analysis
This paper exploits the feature extraction capabilities of the discrete cosine
transform (DCT) together with an illumination normalization approach in the
logarithm domain, which increases its robustness to variations in facial
geometry and illumination. Second, in the same domain, entropy measures are
applied to the DCT coefficients so that the maximum-entropy-preserving pixels
can be extracted as the feature vector. Thus the informative features of a
face can be extracted in a low-dimensional space. Finally, kernel entropy
component analysis (KECA), with an extension of arc-cosine kernels, is applied
to the extracted DCT coefficients that contribute most to the entropy
estimate, to obtain only those real kernel ECA eigenvectors associated with
eigenvalues having a high positive entropy contribution. The resulting system
was successfully tested on real image sequences and is robust to significant
partial occlusion and illumination changes, as validated by experiments on the
FERET, AR, FRAV2D, and ORL face databases. Using specificity and sensitivity,
we find that the best performance is achieved when Renyi entropy is applied to
the DCT coefficients. Extensive experimental comparison demonstrates the
superiority of the proposed approach with respect to recognition accuracy.
Moreover, the proposed approach is very simple, computationally fast, and can
be implemented in any real-time face recognition system. Comment: 9 pages.
Published Online August 2013 in MECS. International Journal of Information
Technology and Computer Science, 2013. arXiv admin note: text overlap with
arXiv:1112.3712 by other authors
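The two measurable ingredients of this pipeline can be sketched briefly: a 2-D
DCT built from the orthonormal DCT-II matrix, and the Renyi entropy used to
rank coefficients. Normalizing coefficient magnitudes into a probability
distribution is an illustrative assumption, not the paper's exact procedure:

```python
import numpy as np

def dct2(img):
    # 2-D DCT-II of a square image via the orthonormal DCT matrix:
    # result = C @ img @ C.T, where C[k, j] holds the cosine basis.
    n = img.shape[0]
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(
        np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] /= np.sqrt(2)  # make the DC row orthonormal
    return C @ img @ C.T

def renyi_entropy(p, alpha=2.0):
    # Renyi entropy of order alpha for a discrete distribution p.
    p = p[p > 0]
    return np.log((p ** alpha).sum()) / (1 - alpha)

# Illustration: treat normalized |DCT coefficient| magnitudes as a
# distribution and measure its Renyi entropy.
coeffs = dct2(np.outer(np.arange(4.0), np.arange(4.0)))
p = np.abs(coeffs).ravel()
p /= p.sum()
H = renyi_entropy(p)
```

Coefficients contributing most to such an entropy estimate would then be the
ones retained as the feature vector before KECA.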
Text Region Extraction from Business Card Images for Mobile Devices
Designing a Business Card Reader (BCR) for mobile devices is a challenge for
researchers because of the heavy deformation in acquired images, the diverse
nature of business cards, and, most importantly, the computational constraints
of mobile devices. This paper presents a text extraction method designed in
our work towards developing a BCR for mobile devices. First, the background of
a camera-captured image is eliminated at a coarse level. Then, various
rule-based techniques are applied to the Connected Components (CCs) to filter
out noise and picture regions. The CCs identified as text are then binarized
using an adaptive but lightweight binarization technique. Experiments show
that the text extraction accuracy is around 98% for a wide range of
resolutions, with varying computation time and memory requirements. The
optimum performance is achieved for images at a resolution of 1024x768 pixels,
with a text extraction accuracy of 98.54% and space and time requirements of
1.1 MB and 0.16 seconds, respectively. Comment: Proc. of International
Conference on Information Technology and Business Intelligence (ITBI-09), pp.
227-235, Nov 6-8, 2009, Nagpur, India
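The rule-based CC filtering stage might look like the sketch below, which
keeps components whose size and aspect ratio are text-like. The function name
and the area and aspect-ratio thresholds are hypothetical; the paper's actual
rules and values are not reproduced here:

```python
def filter_text_components(components, img_area,
                           min_area_frac=1e-4, max_area_frac=0.1,
                           max_aspect=10.0):
    # components: list of (x, y, w, h) bounding boxes of connected
    # components. Discard tiny noise specks (area too small relative to
    # the image), large picture regions (area too large), and extremely
    # elongated shapes unlikely to be characters.
    kept = []
    for (x, y, w, h) in components:
        area = w * h
        aspect = max(w, h) / max(1, min(w, h))
        if (min_area_frac * img_area <= area <= max_area_frac * img_area
                and aspect <= max_aspect):
            kept.append((x, y, w, h))
    return kept
```

The surviving components would then be passed to the adaptive binarization
step before recognition.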